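Decentralized SGD (D-SGD) distributes heavy learning tasks across multiple machines (a.k.a. {\em nodes}), effectively dividing the workload per node by the size of the system. However, a handful of \emph{Byzantine} (i.e., misbehaving) nodes can jeopardize the entire learning procedure. This vulnerability is further amplified when the system is \emph{asynchronous}. Although approaches that confer Byzantine resilience have been proposed, they significantly impact the efficiency of the process, to the point of even negating the benefit of decentralization. This naturally raises the question: \emph{can we enjoy both Byzantine resilience and reduced workload per node?} We answer positively by proposing \newalgorithm{}, which ensures Byzantine resilience without losing the computational efficiency of D-SGD. Essentially, \newalgorithm{} weakens the impact of Byzantine nodes by reducing the variance in local updates using \emph{Polyak's momentum}. Then, by establishing coordination between nodes via {\em signed echo broadcast} and a {\em nearest-neighbor averaging} scheme, we effectively tolerate Byzantine nodes while distributing the overhead among the non-Byzantine nodes. To demonstrate the correctness of our algorithm, we introduce and analyze a novel {\em Lyapunov function} that accounts for the {\em non-Markovian model drift} arising from the use of momentum. We also demonstrate the efficiency of \newalgorithm{} through experiments on several image classification tasks.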
Translated by Google Translate
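The two ingredients named in the abstract above, Polyak's momentum and nearest-neighbor averaging, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the momentum coefficient `beta`, and the rule "average the n - f vectors closest to the node's own vector" are assumptions made for illustration only.

```python
import numpy as np

def polyak_momentum(grads, beta=0.9):
    """Running momentum m_t = beta * m_{t-1} + (1 - beta) * g_t over a gradient stream.

    Assumed form of the momentum update; beta is a hypothetical default.
    """
    m = np.zeros_like(grads[0])
    history = []
    for g in grads:
        m = beta * m + (1.0 - beta) * g
        history.append(m.copy())
    return np.stack(history)

def nearest_neighbor_average(own, received, f):
    """Average the n - f vectors closest to `own` (own vector included).

    Assumed aggregation rule: the f most distant vectors are discarded,
    which removes far-away outliers sent by up to f misbehaving nodes.
    """
    candidates = np.vstack([own[None, :], received])
    dists = np.linalg.norm(candidates - own, axis=1)
    keep = np.argsort(dists)[: len(candidates) - f]
    return candidates[keep].mean(axis=0)

# Noisy gradients around a true gradient of all ones.
rng = np.random.default_rng(0)
grads = np.ones(5) + rng.normal(0.0, 1.0, size=(2000, 5))
momentum = polyak_momentum(grads)

# One node at the origin receives two nearby vectors and one distant outlier.
own = np.zeros(3)
received = np.array([[0.1, 0.0, 0.0], [0.0, -0.1, 0.0], [100.0, 100.0, 100.0]])
aggregate = nearest_neighbor_average(own, received, f=1)
```

In this toy setting the momentum sequence fluctuates far less than the raw gradients, and the distant vector is excluded from the aggregate. A Byzantine vector that stays close to the honest ones would not be filtered by this rule alone.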
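To study the resilience of distributed learning, the "Byzantine" literature considers a strong threat model in which workers can report arbitrary gradients to the parameter server. Although this model has helped obtain several fundamental results, it is sometimes considered unrealistic when the workers are mostly trustworthy machines. In this paper, we show a surprising equivalence between this model and data poisoning, a threat considered much more realistic. More specifically, we prove that every gradient attack can be reduced to data poisoning in any personalized federated learning system with PAC guarantees (which we show are both desirable and realistic). This equivalence makes it possible to obtain new impossibility results on the resilience of any "robust" learning algorithm to data poisoning in highly heterogeneous applications, as corollaries of existing impossibility theorems on Byzantine machine learning. Moreover, using our equivalence, we derive a practical attack that we show (theoretically and empirically) can be very effective against classical personalized federated learning models.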
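We study Byzantine collaborative learning, where $n$ nodes seek to collectively learn from each other's local data. The data distribution may vary from one node to another. No node is trusted, and $f < n$ nodes can behave arbitrarily. We prove that collaborative learning is equivalent to a new form of agreement, which we call averaging agreement. In this problem, each node starts with an initial vector, and the nodes seek to approximately agree on a common vector that is close to the average of the honest nodes' initial vectors. We present two asynchronous solutions to averaging agreement, each of which we prove optimal according to some dimension. The first, based on minimum-diameter averaging, requires $n \geq 6f + 1$ but achieves asymptotically the best possible averaging constant, up to a multiplicative constant. The second, based on reliable broadcast and coordinate-wise mean, achieves optimal Byzantine resilience, i.e., $n \geq 3f + 1$. Each of these algorithms induces an optimal Byzantine collaborative learning protocol. In particular, our equivalence yields new impossibility theorems on what any collaborative learning algorithm can achieve in adversarial and heterogeneous environments.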
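As a rough illustration of the minimum-diameter averaging rule mentioned above, the following sketch picks, among all subsets of $n - f$ vectors, the one with the smallest diameter (largest pairwise distance) and returns its mean. It is a brute-force, single-node view under assumed names; it does not model asynchrony or the agreement rounds of the protocol, and the exhaustive subset search is only practical for small $n$.

```python
from itertools import combinations

import numpy as np

def minimum_diameter_average(vectors, f):
    """Mean of the (n - f)-subset with the smallest diameter (brute-force sketch)."""
    vectors = np.asarray(vectors, dtype=float)
    n = len(vectors)
    # Pairwise Euclidean distances between all received vectors.
    dist = np.linalg.norm(vectors[:, None, :] - vectors[None, :, :], axis=-1)
    best_subset, best_diameter = None, np.inf
    for subset in combinations(range(n), n - f):
        idx = list(subset)
        diameter = dist[np.ix_(idx, idx)].max()
        if diameter < best_diameter:
            best_subset, best_diameter = idx, diameter
    return vectors[best_subset].mean(axis=0)

# n = 7 vectors with f = 1 outlier, consistent with n >= 6f + 1.
honest = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
outlier = [[50.0, 50.0]]
result = minimum_diameter_average(honest + outlier, f=1)
```

Here the only subset of six vectors that avoids the outlier has the smallest diameter, so the result is the mean of the six honest vectors.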
Speech-driven 3D facial animation has been widely explored, with applications in gaming, character animation, virtual reality, and telepresence systems. State-of-the-art methods deform the face topology of the target actor to sync with the input audio without considering the identity-specific speaking style and facial idiosyncrasies of the target actor, resulting in unrealistic and inaccurate lip movements. To address this, we present Imitator, a speech-driven facial expression synthesis method that learns identity-specific details from a short input video and produces novel facial expressions matching the identity-specific speaking style and facial idiosyncrasies of the target actor. Specifically, we train a style-agnostic transformer on a large facial expression dataset, which we use as a prior for audio-driven facial expressions. Starting from this prior, we optimize for the identity-specific speaking style using a short reference video. To train the prior, we introduce a novel loss function based on detected bilabial consonants to ensure plausible lip closures and consequently improve the realism of the generated expressions. Through detailed experiments and a user study, we show that our approach produces temporally coherent facial expressions from input audio while preserving the speaking style of the target actors.
Several face de-identification methods have been proposed to preserve users' privacy by obscuring their faces. These methods, however, can degrade the quality of photos, and they usually do not preserve the utility of faces, e.g., their age, gender, pose, and facial expression. Recently, advanced generative adversarial network models, such as StyleGAN, have been proposed, which generate realistic, high-quality imaginary faces. In this paper, we investigate the use of StyleGAN in generating de-identified faces through style mixing, where the styles or features of the target face and an auxiliary face are mixed to generate a de-identified face that carries the utilities of the target face. We examine this de-identification method with respect to preserving utility and privacy by implementing several face detection, verification, and identification attacks. Through extensive experiments and a comparison with two state-of-the-art face de-identification methods, we show that StyleGAN preserves the quality and utility of the faces much better than the other approaches and that, when the style mixing levels are chosen correctly, it also preserves the privacy of the faces much better than the other methods.
This paper proposes embedded Gaussian Process Barrier States (GP-BaS), a methodology to safely control unmodeled dynamics of nonlinear systems using Bayesian learning. Gaussian Processes (GPs) are used to model the dynamics of the safety-critical system, and this model is subsequently used in the GP-BaS model. We derive the barrier state dynamics utilizing the GP posterior, which is used to construct a safety-embedded Gaussian process dynamical model (GPDM). We show that the safety-critical system can be controlled to remain inside the safe region as long as we can design a controller that renders the BaS-GPDM's trajectories bounded (or asymptotically stable). The proposed approach overcomes various limitations of earlier attempts at combining GPs with barrier functions by avoiding restrictive assumptions such as linearity of the system with respect to control, the relative degree of the constraints, and the number or nature of constraints. This work is implemented on various examples for trajectory optimization and control, including optimal stabilization of an unstable linear system, safe trajectory optimization of a Dubins vehicle navigating through an obstacle course, and a quadrotor in an obstacle avoidance task, using GP differentiable dynamic programming (GP-DDP). The proposed framework is capable of maintaining safe optimization and control of unmodeled dynamics and is purely data driven.
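To make the idea of a barrier state concrete, here is a minimal sketch under stated assumptions. It assumes an inverse barrier $\beta = 1/h(x)$ for a safety condition $h(x) > 0$, whose time derivative is $\dot\beta = -\beta^2 \, \nabla h(x) \cdot \dot x$. The callable `mean_dynamics` is a hypothetical stand-in for a learned GP posterior mean (no GP is fitted here), and the abstract does not state which barrier function or integration scheme the authors use.

```python
import numpy as np

def simulate_barrier_state(mean_dynamics, h, grad_h, x0, controller,
                           dt=1e-3, steps=1000):
    """Euler-integrate the state x together with the barrier state beta = 1 / h(x).

    mean_dynamics(x, u) stands in for a learned model of x_dot (assumption).
    If beta stays bounded along the trajectory, h(x) stays positive.
    """
    x = np.asarray(x0, dtype=float)
    beta = 1.0 / h(x)
    for _ in range(steps):
        u = controller(x, beta)
        x_dot = mean_dynamics(x, u)
        # Chain rule for the inverse barrier: d/dt (1 / h) = -(1 / h^2) * grad_h . x_dot
        beta_dot = -(beta ** 2) * float(grad_h(x) @ x_dot)
        x = x + dt * x_dot
        beta = beta + dt * beta_dot
    return x, beta

# Toy example: x_dot = -x + u with zero control, safe set x < 2.
x_final, beta_final = simulate_barrier_state(
    mean_dynamics=lambda x, u: -x + u,
    h=lambda x: 2.0 - x[0],
    grad_h=lambda x: np.array([-1.0]),
    x0=[1.0],
    controller=lambda x, beta: np.zeros(1),
)
```

In the toy example the integrated barrier state tracks $1/h(x)$ closely, which is the property that lets boundedness of the augmented trajectory stand in for safety of the original system.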
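Large language models such as GPT-3 (Brown et al., 2020) can perform arbitrary tasks without fine-tuning after being prompted with only a few labeled examples. An arbitrary task can be reformulated as a natural language prompt, and a language model can be asked to generate the completion, indirectly performing the task in a paradigm known as prompt-based learning. To date, emergent prompt-based learning capabilities have mainly been demonstrated for unidirectional language models. However, bidirectional language models pre-trained on denoising objectives such as masked language modeling produce stronger learned representations for transfer learning. This motivates the possibility of prompting bidirectional models, but their pre-training objectives make them largely incompatible with the existing prompting paradigm. We present SAP (Sequential Autoregressive Prompting), a technique that enables the prompting of bidirectional models. Using machine translation as a case study, we prompt the bidirectional mT5 model (Xue et al., 2021) with SAP and demonstrate that its few-shot and zero-shot translations outperform the few-shot translations of unidirectional models such as GPT-3 and XGLM (Lin et al., 2021), despite mT5 having approximately 50% fewer parameters. We further show that SAP is effective on question answering and summarization. Our results demonstrate, for the first time, that prompt-based learning is an emergent property of a broader class of language models, rather than only unidirectional models.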
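We propose a two-stage training approach for developing a single NMT model that translates unseen languages both to and from English. In the first stage, we initialize an encoder-decoder model with pretrained XLM-R and RoBERTa weights, then perform multilingual fine-tuning on parallel data in 25 languages. We find that this model can generalize to zero-shot translation of unseen languages. In the second stage, we leverage this generalization ability to generate synthetic parallel data from monolingual datasets, then train with successive rounds of back-translation. The final model extends to the English-to-Many direction while retaining Many-to-English performance. We term our approach EcXTra (English-centric Crosslingual (X) Transfer). Our approach sequentially leverages auxiliary parallel data and monolingual data, and is conceptually simple, using only a standard cross-entropy objective in both stages. The final EcXTra model is evaluated on unsupervised NMT for 8 low-resource languages, achieving a new state of the art for English-to-Kazakh (22.3 > 10.4 BLEU) and competitive performance for the other 15 translation directions.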
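Establishing voxelwise semantic correspondence across distinct imaging modalities is a foundational yet formidable computer vision task. Current multi-modality registration techniques maximize hand-crafted inter-domain similarity functions, are limited in modeling nonlinear intensity relationships and deformations, and may require significant re-engineering or underperform on new tasks, datasets, and domain pairs. This work presents ContraReg, an unsupervised contrastive representation learning approach to multi-modality deformable registration. By projecting learned multi-scale local patch features onto a jointly learned inter-domain embedding space, ContraReg obtains representations useful for non-rigid multi-modality alignment. Experimentally, ContraReg achieves accurate and robust results with smooth and invertible deformations, compared against a series of baselines and ablations on a neonatal T1-T2 brain MRI registration task, with all methods validated over a wide range of deformation regularization strengths.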
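Diffusion processes that evolve according to linear stochastic differential equations are an important family of continuous-time dynamic decision-making models. Optimal policies are well studied for them when the drift matrices are known with certainty. However, little is known about data-driven control of diffusion processes with uncertain drift matrices, as conventional discrete-time analysis techniques are not applicable. In addition, while the task can be viewed as a reinforcement learning problem involving an exploration-exploitation trade-off, ensuring system stability is a fundamental component of designing optimal policies. We establish that the popular Thompson sampling algorithm learns optimal actions fast, incurring only a square-root-of-time regret, and also stabilizes the system in a short time period. To the best of our knowledge, this is the first such result for Thompson sampling in a diffusion process control problem. We validate our theoretical results through empirical simulations with real parameter matrices from two settings, airplane control and blood glucose control. Moreover, we observe that Thompson sampling significantly improves (worst-case) regret compared to state-of-the-art algorithms, suggesting that Thompson sampling explores in a more guarded fashion. Our theoretical analysis involves the characterization of a certain optimality manifold that ties the local geometry of the drift parameters to the optimal control of the diffusion process. We expect this technique to be of broader interest.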
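The following is a minimal sketch of what Thompson sampling can look like for a scalar diffusion $dx = (a x + b u)\,dt + dW$ with quadratic cost. It is not the algorithm analyzed in the paper: the scalar setting, the standard Gaussian prior on $(a, b)$, the fixed-length episodes, the Euler-Maruyama discretization, and the gain clipping `k_max` are all assumptions made to keep the example short.

```python
import numpy as np

def lqr_gain(a, b, q=1.0, r=1.0):
    """Optimal gain k (control u = -k x) for the scalar drift a x + b u.

    Uses the positive root of the scalar Riccati equation
    2 a p - (b^2 / r) p^2 + q = 0.
    """
    p = r * (a + np.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def thompson_control(a_true, b_true, horizon=20.0, dt=0.01, episode=1.0,
                     k_max=20.0, seed=0):
    """Sample drift parameters from a Gaussian posterior at the start of each episode."""
    rng = np.random.default_rng(seed)
    precision = np.eye(2)      # assumed prior N(0, I) on theta = (a, b)
    moment = np.zeros(2)
    x, k = 0.0, 0.0
    steps_per_episode = int(round(episode / dt))
    for step in range(int(round(horizon / dt))):
        if step % steps_per_episode == 0:
            cov = np.linalg.inv(precision)
            a_s, b_s = rng.multivariate_normal(cov @ moment, cov)
            if abs(b_s) > 1e-3:  # keep the previous gain if the sampled b is near zero
                k = float(np.clip(lqr_gain(a_s, b_s), -k_max, k_max))
        u = -k * x
        # Euler-Maruyama step of the true (unknown) diffusion.
        dx = (a_true * x + b_true * u) * dt + np.sqrt(dt) * rng.normal()
        # Gaussian posterior update for dx = theta . z dt + dW.
        z = np.array([x, u])
        precision += np.outer(z, z) * dt
        moment += z * dx
        x += dx
    return np.linalg.solve(precision, moment), x

estimate, x_final = thompson_control(a_true=1.0, b_true=1.0, horizon=2.0)
```

For known parameters, `lqr_gain(1.0, 1.0)` gives $k = 1 + \sqrt{2}$, so the closed-loop drift is $a - bk = -\sqrt{2}$, which is stable. With unknown parameters, the sampled gain changes between episodes, and that variation is what provides exploration in this sketch.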